Attribute-Value Selection Based on Minimum Description Length

Authors

  • Istvan Jonyer
  • Lawrence B. Holder
  • Diane J. Cook
Abstract

We introduce a new method for attribute-value selection driven by the minimum description length principle. We demonstrate the viability of the approach on the Wisconsin breast cancer data set, show a working example, and evaluate the approach against earlier systems. Comparisons on different domains are also given. Empirical results show that our approach consistently outperforms competing machine learning algorithms on domains with all-numeric, all-discrete, and mixed attribute types.

Similar resources

Attribute Value Selection Considering the Minimum Description Length Approach and Feature Granularity

In this paper we introduce a new approach to automatic attribute and granularity selection for building optimum regression trees. The method is based on the minimum description length principle (MDL) and aspects of granular computing. The approach is verified by giving an example using a data set which is extracted and preprocessed from an operational information system of the Components Toolsh...

Bayesian Models to Assess Risk of Corruption of Federal Management Units

This paper presents a data mining project that generated Bayesian models to assess risk of corruption of federal management units. With thousands of extracted features related to corruptibility, the data were processed using techniques like correlation analysis and variance per class. We also compared two different discretization methods: Minimum Description Length Principle (MDLP) and Class-At...

The Cruncher: Automatic Concept Formation Using Minimum Description Length

We present The Cruncher, a simple representation framework and algorithm based on minimum description length for automatically forming an ontology of concepts from attribute-value data sets. Although unsupervised, when The Cruncher is applied to an animal data set, it produces a nearly zoologically accurate categorization. We demonstrate The Cruncher’s utility for finding useful macro-actions i...

Adjusting the Spanner: Testing an Evidence Accumulation Model of Decision Making

An experiment examined two aspects of performance in a multi-attribute inference task: i) the effect of stimulus presentation format (image or text) on the adoption of decision strategies; and ii) the ability of an evidence accumulation model, which unifies take-the-best (TTB) and rational (RAT) strategies, to explain participants’ judgments. Presentation format had no significant effect on str...

MDL-Based Unsupervised Attribute Ranking

In the present paper we propose an unsupervised attribute ranking method based on evaluating the quality of clustering that each attribute produces by partitioning the data into subsets according to its values. We use the Minimum Description Length (MDL) principle to evaluate the quality of clustering and describe an algorithm for attribute ranking and a related clustering algorithm. Both algor...

Publication date: 2004